    Iterative Assessment and Improvement of DNN Operational Accuracy

    Deep Neural Networks (DNNs) are nowadays widely adopted in many application domains thanks to their human-like, or even superhuman, performance on specific tasks. However, due to unpredictable or unconsidered operating conditions, unexpected failures show up in the field, making the performance of a DNN in operation very different from the one estimated prior to release. In the life cycle of DNN systems, the assessment of accuracy is typically addressed in two ways: offline, via sampling of operational inputs, or online, via pseudo-oracles. The former is considered more expensive due to the need for manual labeling of the sampled inputs; the latter is automatic but less accurate. We believe that emerging iterative industrial-strength life cycle models for Machine Learning systems, like MLOps, offer the possibility to leverage inputs observed in operation not only to provide faithful estimates of a DNN's accuracy, but also to improve it through remodeling/retraining actions. We propose DAIC (DNN Assessment and Improvement Cycle), an approach that combines "low-cost" online pseudo-oracles and "high-cost" offline sampling techniques to estimate and improve the operational accuracy of a DNN over the iterations of its life cycle. Preliminary results show the benefits of combining the two approaches and integrating them in the DNN life cycle.
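
    As a concrete illustration of how a "low-cost" online pseudo-oracle and a "high-cost" offline labeled sample might be combined into a single accuracy estimate, consider the minimal Python sketch below. The stratified estimator and the function names (model, pseudo_oracle, label_manually) are illustrative assumptions, not DAIC's actual algorithm.

        import random

        def estimate_accuracy(inputs, model, pseudo_oracle, label_manually, budget):
            # Cheap online pass: the pseudo-oracle judges every prediction.
            preds = {x: model(x) for x in inputs}
            strata = {True: [], False: []}
            for x in inputs:
                strata[pseudo_oracle(x, preds[x])].append(x)

            # Expensive offline pass: manually label a small sample per stratum
            # to measure how often each pseudo-oracle verdict is actually right.
            acc = {}
            for verdict, xs in strata.items():
                n = min(budget // 2, len(xs))
                sample = random.sample(xs, n)
                correct = sum(preds[x] == label_manually(x) for x in sample)
                acc[verdict] = correct / n if n else 0.0

            # Combine the strata by the law of total probability.
            return sum(len(xs) / len(inputs) * acc[v] for v, xs in strata.items())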

    Hybrid Simulation and Test of Vessel Traffic Systems on the Cloud

    This paper presents a cloud-based hybrid simulation platform to test large-scale distributed Systems-of-Systems (SoS) for the management and control of maritime traffic, the so-called Vessel Traffic Systems (VTS). A VTS consists of multiple heterogeneous, distributed, and interoperating systems, including radars, automatic identification systems, direction finders, electro-optical sensors, gateways to external VTSs, and information systems; identifying, representing, and analyzing their interactions is a challenge for the evaluation of the real risks to the safety and security of the marine environment. The need to reproduce in the laboratory the system behaviors that could occur in situ demands the ability to integrate emulated and simulated environments, so as to cope with the different testability requirements of the involved systems and to keep testing costs sustainable. The platform exploits hybrid simulation and virtualization technologies, and it is deployable on a private cloud, reducing the cost of setting up realistic and effective testing scenarios.

    Assessing Black-box Test Case Generation Techniques for Microservices

    Testing of microservices architectures (MSA), today a popular software architectural style, demands automation in several of its tasks, such as test generation, prioritization, and execution. Automated black-box generation of test cases for MSA currently borrows techniques and tools from the testing of RESTful Web Services. This paper: i) proposes the uTest stateless pairwise combinatorial technique (and its automation tool) for the generation of test cases for functional and robustness microservices testing, and ii) experimentally compares, on three open-source MSA used as subjects, four state-of-the-art black-box tools conceived for Web Services, adopting evolutionary-, dependency-, and mutation-based generation techniques, against the proposed uTest combinatorial tool. The comparison shows little difference in coverage values; uTest pairwise testing achieves a better average failure rate with a considerably lower number of tests. Web Services tools do not perform as well for MSA as a tester might expect, highlighting the need for MSA-specific techniques.
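
    To give a flavor of the pairwise idea behind uTest, the Python sketch below greedily builds a 2-way covering suite over hypothetical request parameters of a microservice endpoint. It enumerates the full Cartesian product as candidates, which real generators avoid for efficiency; it is an illustration of pairwise covering, not the uTest tool itself.

        from itertools import combinations, product

        def pairwise_suite(parameters):
            # parameters: dict name -> list of values (hypothetical endpoint inputs).
            names = list(parameters)
            # Every value pair that must co-occur in at least one test case.
            uncovered = {((a, va), (b, vb))
                         for a, b in combinations(names, 2)
                         for va in parameters[a] for vb in parameters[b]}
            candidates = list(product(*(parameters[n] for n in names)))
            suite = []
            while uncovered:
                def gain(case):
                    bind = dict(zip(names, case))
                    return sum(1 for (a, va), (b, vb) in uncovered
                               if bind[a] == va and bind[b] == vb)
                # Pick the full combination covering the most uncovered pairs.
                best = dict(zip(names, max(candidates, key=gain)))
                uncovered = {((a, va), (b, vb)) for (a, va), (b, vb) in uncovered
                             if best[a] != va or best[b] != vb}
                suite.append(best)
            return suite

        # Example: three request parameters of a hypothetical endpoint.
        print(pairwise_suite({"method": ["GET", "POST"],
                              "auth":   ["none", "token"],
                              "body":   ["valid", "malformed", "empty"]}))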

    Assessing Operational Accuracy of CNN-based Image Classifiers using an Oracle Surrogate

    Context: Assessing the accuracy in operation of a Machine Learning (ML) system for image classification on arbitrary (unlabeled) inputs is hard. This is due to the oracle problem, which impacts the ability to automatically judge the output of the classification, thus hindering the assessment when previously unseen, unlabeled inputs are submitted to the system.
    Objective: We propose the Image Classification Oracle Surrogate (ICOS), a technique to automatically evaluate the accuracy in operation of image classifiers based on Convolutional Neural Networks (CNNs).
    Method: To establish whether the classification of an arbitrary image is correct, ICOS leverages three knowledge sources: operational input data, training data, and the ML algorithm. Knowledge is expressed through likely invariants, properties which should not be violated by correct classifications. ICOS infers and filters invariants to improve the detection of misclassifications while reducing the number of false positives. We evaluate ICOS experimentally on twelve CNNs, using the popular MNIST, CIFAR10, CIFAR100, and ImageNet datasets, and compare it to two alternative strategies, namely cross-referencing and self-checking.
    Results: Experimental results show that ICOS exhibits performance comparable to the other strategies in terms of accuracy, with higher stability across a variety of CNNs and datasets of different complexity and size.
    Conclusions: ICOS likely invariants are effective in automatically detecting misclassifications by CNNs used in image classification tasks when the expected output is unknown; ICOS ultimately yields faithful assessments of their accuracy in operation. Knowledge about input data can also be manually incorporated into ICOS, to increase robustness against unexpected phenomena in operation, such as label shift.
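
    For a concrete flavor of invariant-based misclassification detection, the sketch below infers one simple family of likely invariants (per-class lower bounds on the softmax confidence of correct predictions) and flags operational outputs that violate them. ICOS infers richer invariants from three knowledge sources; this simplification and its threshold choice are assumptions, not the ICOS technique.

        import numpy as np

        def infer_confidence_invariants(train_probs, train_labels, quantile=0.01):
            # train_probs: (N, C) softmax outputs on labeled (training) data;
            # train_labels: (N,) ground-truth class indices.
            preds = train_probs.argmax(axis=1)
            bounds = {}
            for c in range(train_probs.shape[1]):
                ok = (preds == c) & (train_labels == c)  # correct class-c predictions
                conf = train_probs[ok, c]
                # Likely invariant: correct class-c outputs rarely fall below this.
                bounds[c] = np.quantile(conf, quantile) if conf.size else 1.0
            return bounds

        def flag_suspicious(probs, bounds):
            # Flag operational outputs whose confidence violates the invariant
            # of their predicted class: likely misclassifications.
            preds = probs.argmax(axis=1)
            conf = probs[np.arange(len(probs)), preds]
            return conf < np.array([bounds[c] for c in preds])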

    Bug Localization in Test-Driven Development

    An effort allocation method to optimal code sanitization for quality-aware energy efficiency improvement

    Software energy efficiency has been shown to remarkably affect the energy consumption of IT platforms. Besides the "performance" of the code in efficiently accomplishing a task, its "correctness" matters too: software containing defects is likely to fail, and the computational cost of completing an operation becomes much higher if the user encounters a failure. Both the performance-related energy efficiency of software and its defectiveness are impacted by the quality of the code. Exploiting the relation between code quality and energy/defectiveness attributes is the main idea behind this position paper. Starting from the authors' previous experience in this field, we define a method to first predict the applications of a software system most likely to impact energy consumption and to carry higher residual defectiveness, and then to exploit the prediction to optimally schedule the effort for code sanitization, thus supporting, with quantitative figures, the decision-makers of quality assurance teams.
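
    A minimal sketch of what scheduling sanitization effort on top of such predictions could look like: modules are ranked by a benefit-per-cost score and greedily assigned effort within a budget. The equal score weights and the input fields are assumptions for illustration, not the method defined in the paper.

        def schedule_sanitization(modules, budget):
            # modules: dicts with (hypothetical) predicted scores, e.g.
            #   {"name": "billing", "energy_impact": 0.8,
            #    "defectiveness": 0.6, "cost": 3}
            # budget: total person-days available for code sanitization.
            def score(m):
                # Expected benefit per unit cost; equal weights are an assumption.
                return (0.5 * m["energy_impact"]
                        + 0.5 * m["defectiveness"]) / m["cost"]

            plan, spent = [], 0
            for m in sorted(modules, key=score, reverse=True):
                if spent + m["cost"] <= budget:
                    plan.append(m["name"])
                    spent += m["cost"]
            return plan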

    Reliability-Oriented Verification of Mission-Critical Software Systems

    With software systems increasingly being employed in critical contexts, assuring high reliability levels for large, complex systems can incur huge verification costs. Critical system developers often encounter serious difficulties in satisfying reliability requirements at competitive and acceptable cost and time. Currently, it is not clear how engineers should plan an effective verification strategy oriented to improving the final reliability, since it is not trivial to figure out which activities most impact the reliability-cost trade-off and how much they affect reliability. Most often, crucial choices in the verification activity are left to the engineers' intuition: they base their decisions on personal expertise and past experience, due to the lack of convincing approaches to cope with them. However, when dealing with high reliability targets and tight time/cost constraints, engineers responsible for verification should have quantitative evidence of the consequences of their choices, and base their decisions on it.
    One fundamental aspect of a reliability-oriented verification process concerns the identification of the most critical parts of the system, i.e., the major contributors to its unreliability. This is crucial to conveniently distribute verification efforts. However, even with a suitable allocation of effort, engineers should know which verification techniques most impact the final reliability, and which techniques are best suited to the features of the system under test. Hence, the proper selection of the verification techniques that best adapt to the specific system being developed is another critical challenge to be addressed. By coping with these issues, engineers could tune a verification process for their systems simply by following a quantitative reasoning able to highlight the costs and benefits of each choice.
    Based on these considerations, the thesis proposes a solution for carrying out an effective verification specifically oriented to improving reliability. It intends to provide engineers with quantitative means to be adopted and embedded in their process, allowing them to conveniently allocate efforts and select techniques for the system under test. The thesis first identifies the major open challenges to be faced, trying to figure out the most crucial steps engineers need to take for effective planning. Then, to cope with them, it proposes: i) an optimization model to allocate verification effort to different system components in order to achieve a required reliability level at minimum verification cost; ii) an approach, based on empirical analyses, to quantitatively support the selection of the best verification techniques; iii) a procedure to improve verification processes in the considered class of systems, able to iteratively refine results across the developed projects.
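
    To make the flavor of such an optimization model concrete, the sketch below greedily assigns verification effort units to the component whose modeled reliability gains the most per step, until a series-system reliability target is met. The exponential reliability-growth model R(e) = 1 - f * exp(-k * e), the greedy strategy, and all parameter values are illustrative assumptions, not the model proposed in the thesis.

        import math

        def allocate_verification_effort(components, target, step=1.0):
            # components: name -> (f, k), modeling component reliability under
            # effort e as R(e) = 1 - f * exp(-k * e), where f is the initial
            # failure probability and k the verification effectiveness (assumed).
            effort = {name: 0.0 for name in components}

            def rel(name):
                f, k = components[name]
                return 1.0 - f * math.exp(-k * effort[name])

            def system_reliability():
                r = 1.0
                for name in components:  # series system: every component must work
                    r *= rel(name)
                return r

            while system_reliability() < target:
                def gain(name):
                    # Reliability improvement from one more effort unit here.
                    before = rel(name)
                    effort[name] += step
                    after = rel(name)
                    effort[name] -= step
                    return after - before
                # Spend the next unit where it pays off most.
                effort[max(components, key=gain)] += step
            return effort

        print(allocate_verification_effort(
            {"parser": (0.10, 0.3), "scheduler": (0.05, 0.2)}, target=0.95))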